Skip to content

[mlir][core] Add an MLIR "pattern catalog" generator #146228

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

j2kun
Copy link
Contributor

@j2kun j2kun commented Jun 28, 2025

This PR adds a feature that, when enabled, it attaches a listener to all RewritePatterns that emits the name of the operation being modified to llvm::dbgs. When the MLIR test suite is run, these debug outputs can be filtered and combined into an index linking operations to the patterns that insert, modify, or replace them. This index is intended to be used to create a website that allows one to look up patterns from an operation name.

The feature is enabled when MLIR is combined in debug mode, though a previous version of this PR used a CMake build flag to trigger the behavior. The debug logs emitted can be viewed with --debug-only=generate-pattern-catalog, and the lit config is modified to do this when the env var MLIR_GENERATE_PATTERN_CATALOG is set.

Example usage:

mkdir build && cd build
cmake -G Ninja ../llvm \
  -DLLVM_ENABLE_PROJECTS="mlir" \
  -DLLVM_TARGETS_TO_BUILD="host" \
  -DCMAKE_BUILD_TYPE=DEBUG
ninja -j 24 check-mlir
MLIR_GENERATE_PATTERN_CATALOG=1 bin/llvm-lit -j 24 -v -a tools/mlir/test | grep 'CatalogingListener: ' | sed 's/^# | CatalogingListener://g' | sort | uniq > pattern_catalog.txt

Sample pattern catalog output (that fits in a gist): https://gist.github.com/j2kun/02d1ab8d31c10d71027724984c89905a

[edit]: removed use of RTTI, removed cmake build flag

This PR adds a new cmake build option MLIR_ENABLE_CATALOG_GENERATOR.
When enabled, it attaches a listener to all RewritePatterns that emits
the name of the operation being modified to a special file. When the
MLIR test suite is run, these files can be combined into an index
linking operations to the patterns that insert, modify, or replace them.
This index is intended to be used to create a website that allows one
to look up patterns from an operation name.
@llvmbot llvmbot added mlir:core MLIR Core Infrastructure mlir bazel "Peripheral" support tier build system: utils/bazel labels Jun 28, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 28, 2025

@llvm/pr-subscribers-mlir-core

Author: Jeremy Kun (j2kun)

Changes

This PR adds a new cmake build option MLIR_ENABLE_CATALOG_GENERATOR. When enabled, it attaches a listener to all RewritePatterns that emits the name of the operation being modified to a special file. When the MLIR test suite is run, these files can be combined into an index linking operations to the patterns that insert, modify, or replace them. This index is intended to be used to create a website that allows one to look up patterns from an operation name.


Full diff: https://github.com/llvm/llvm-project/pull/146228.diff

5 Files Affected:

  • (modified) mlir/CMakeLists.txt (+19)
  • (modified) mlir/cmake/modules/MLIRConfig.cmake.in (+1)
  • (modified) mlir/include/mlir/IR/PatternMatch.h (+74)
  • (modified) mlir/lib/Rewrite/PatternApplicator.cpp (+47-2)
  • (modified) utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel (+1)
diff --git a/mlir/CMakeLists.txt b/mlir/CMakeLists.txt
index 44493b75b8a8c..bd30d94e1ccb4 100644
--- a/mlir/CMakeLists.txt
+++ b/mlir/CMakeLists.txt
@@ -202,6 +202,24 @@ if(MLIR_ENABLE_BINDINGS_PYTHON)
   mlir_configure_python_dev_packages()
 endif()
 
+#-------------------------------------------------------------------------------
+# MLIR Pattern Catalog Generator Configuration
+# Requires:
+#   RTTI to be enabled (set with -DLLVM_ENABLE_RTTI=ON)
+# When enabled, causes all rewriter patterns to dump their type names and the
+# names of affected operations, which can be used to build a search index
+# mapping operations to patterns.
+#-------------------------------------------------------------------------------
+
+set(MLIR_ENABLE_CATALOG_GENERATOR 0 CACHE BOOL
+       "Enables construction of a catalog of rewrite patterns.")
+
+if (MLIR_ENABLE_CATALOG_GENERATOR)
+  message(STATUS "Enabling MLIR pattern catalog generator")
+  add_definitions(-DMLIR_ENABLE_CATALOG_GENERATOR)
+  add_definitions(-DLLVM_ENABLE_RTTI)
+endif()
+
 set(CMAKE_INCLUDE_CURRENT_DIR ON)
 
 include_directories(BEFORE
@@ -322,3 +340,4 @@ endif()
 if(MLIR_STANDALONE_BUILD)
   llvm_distribution_add_targets()
 endif()
+
diff --git a/mlir/cmake/modules/MLIRConfig.cmake.in b/mlir/cmake/modules/MLIRConfig.cmake.in
index 71f3e028b1e88..f4ae70a22b3d2 100644
--- a/mlir/cmake/modules/MLIRConfig.cmake.in
+++ b/mlir/cmake/modules/MLIRConfig.cmake.in
@@ -16,6 +16,7 @@ set(MLIR_IRDL_TO_CPP_EXE "@MLIR_CONFIG_IRDL_TO_CPP_EXE@")
 set(MLIR_INSTALL_AGGREGATE_OBJECTS "@MLIR_INSTALL_AGGREGATE_OBJECTS@")
 set(MLIR_ENABLE_BINDINGS_PYTHON "@MLIR_ENABLE_BINDINGS_PYTHON@")
 set(MLIR_ENABLE_EXECUTION_ENGINE "@MLIR_ENABLE_EXECUTION_ENGINE@")
+set(MLIR_ENABLE_CATALOG_GENERATOR "@MLIR_ENABLE_CATALOG_GENERATOR@")
 
 set_property(GLOBAL PROPERTY MLIR_ALL_LIBS "@MLIR_ALL_LIBS@")
 set_property(GLOBAL PROPERTY MLIR_DIALECT_LIBS "@MLIR_DIALECT_LIBS@")
diff --git a/mlir/include/mlir/IR/PatternMatch.h b/mlir/include/mlir/IR/PatternMatch.h
index 10cfe851765dc..141b3c6806ed8 100644
--- a/mlir/include/mlir/IR/PatternMatch.h
+++ b/mlir/include/mlir/IR/PatternMatch.h
@@ -475,6 +475,80 @@ class RewriterBase : public OpBuilder {
     RewriterBase::Listener *rewriteListener;
   };
 
+  struct CatalogingListener : public RewriterBase::ForwardingListener {
+    CatalogingListener(OpBuilder::Listener *listener,
+                       const std::string &patternName, raw_ostream &os,
+                       std::mutex &writeMutex)
+        : RewriterBase::ForwardingListener(listener), patternName(patternName),
+          os(os), writeMutex(writeMutex) {}
+
+    void notifyOperationInserted(Operation *op, InsertPoint previous) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationInserted"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationInserted(op, previous);
+    }
+
+    void notifyOperationModified(Operation *op) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationModified"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationModified(op);
+    }
+
+    void notifyOperationReplaced(Operation *op, Operation *newOp) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationReplaced (with op)"
+           << " | " << op->getName() << " | " << newOp->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationReplaced(op, newOp);
+    }
+
+    void notifyOperationReplaced(Operation *op,
+                                 ValueRange replacement) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationReplaced (with values)"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationReplaced(op, replacement);
+    }
+
+    void notifyOperationErased(Operation *op) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationErased"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationErased(op);
+    }
+
+    void notifyPatternBegin(const Pattern &pattern, Operation *op) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyPatternBegin"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyPatternBegin(pattern, op);
+    }
+
+  private:
+    const std::string &patternName;
+    raw_ostream &os;
+    std::mutex &writeMutex;
+  };
+
   /// Move the blocks that belong to "region" before the given position in
   /// another region "parent". The two regions must be different. The caller
   /// is responsible for creating or updating the operation transferring flow
diff --git a/mlir/lib/Rewrite/PatternApplicator.cpp b/mlir/lib/Rewrite/PatternApplicator.cpp
index 4a12183492fd4..c66aaf267881f 100644
--- a/mlir/lib/Rewrite/PatternApplicator.cpp
+++ b/mlir/lib/Rewrite/PatternApplicator.cpp
@@ -15,8 +15,19 @@
 #include "ByteCode.h"
 #include "llvm/Support/Debug.h"
 
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/raw_ostream.h"
+#include <cxxabi.h>
+#include <mutex>
+#endif
+
 #define DEBUG_TYPE "pattern-application"
 
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+static std::mutex catalogWriteMutex;
+#endif
+
 using namespace mlir;
 using namespace mlir::detail;
 
@@ -152,6 +163,16 @@ LogicalResult PatternApplicator::matchAndRewrite(
   unsigned anyIt = 0, anyE = anyOpPatterns.size();
   unsigned pdlIt = 0, pdlE = pdlMatches.size();
   LogicalResult result = failure();
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+  std::error_code ec;
+  llvm::raw_fd_ostream catalogOs("pattern_catalog.txt", ec,
+                                 llvm::sys::fs::OF_Append);
+  if (ec) {
+    op->emitError("Failed to open pattern catalog file: " + ec.message());
+    return failure();
+  }
+#endif
+
   do {
     // Find the next pattern with the highest benefit.
     const Pattern *bestPattern = nullptr;
@@ -206,14 +227,38 @@ LogicalResult PatternApplicator::matchAndRewrite(
           } else {
             LLVM_DEBUG(llvm::dbgs() << "Trying to match \""
                                     << bestPattern->getDebugName() << "\"\n");
-
             const auto *pattern =
                 static_cast<const RewritePattern *>(bestPattern);
-            result = pattern->matchAndRewrite(op, rewriter);
 
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+            OpBuilder::Listener *oldListener = rewriter.getListener();
+            int status;
+            const char *mangledPatternName = typeid(*pattern).name();
+            char *demangled = abi::__cxa_demangle(mangledPatternName, nullptr,
+                                                  nullptr, &status);
+            std::string demangledPatternName;
+            if (status == 0 && demangled) {
+              demangledPatternName = demangled;
+              free(demangled);
+            } else {
+              // Fallback in case demangling fails.
+              demangledPatternName = mangledPatternName;
+            }
+
+            RewriterBase::CatalogingListener *catalogingListener =
+                new RewriterBase::CatalogingListener(
+                    oldListener, demangledPatternName, catalogOs,
+                    catalogWriteMutex);
+            rewriter.setListener(catalogingListener);
+#endif
+            result = pattern->matchAndRewrite(op, rewriter);
             LLVM_DEBUG(llvm::dbgs()
                        << "\"" << bestPattern->getDebugName() << "\" result "
                        << succeeded(result) << "\n");
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+            rewriter.setListener(oldListener);
+            delete catalogingListener;
+#endif
           }
 
           // Process the result of the pattern application.
diff --git a/utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel b/utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel
index 23d89f41a3a45..281b2566304b0 100644
--- a/utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel
+++ b/utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel
@@ -44,6 +44,7 @@ expand_template(
         "@MLIR_ENABLE_SPIRV_CPU_RUNNER@": "0",
         "@MLIR_ENABLE_VULKAN_RUNNER@": "0",
         "@MLIR_ENABLE_BINDINGS_PYTHON@": "0",
+        "@MLIR_ENABLE_CATALOG_GENERATOR@": "0",
         "@MLIR_RUN_AMX_TESTS@": "0",
         "@MLIR_RUN_ARM_SVE_TESTS@": "0",
         "@MLIR_RUN_ARM_SME_TESTS@": "0",

@llvmbot
Copy link
Member

llvmbot commented Jun 28, 2025

@llvm/pr-subscribers-mlir

Author: Jeremy Kun (j2kun)

Changes

This PR adds a new cmake build option MLIR_ENABLE_CATALOG_GENERATOR. When enabled, it attaches a listener to all RewritePatterns that emits the name of the operation being modified to a special file. When the MLIR test suite is run, these files can be combined into an index linking operations to the patterns that insert, modify, or replace them. This index is intended to be used to create a website that allows one to look up patterns from an operation name.


Full diff: https://github.com/llvm/llvm-project/pull/146228.diff

5 Files Affected:

  • (modified) mlir/CMakeLists.txt (+19)
  • (modified) mlir/cmake/modules/MLIRConfig.cmake.in (+1)
  • (modified) mlir/include/mlir/IR/PatternMatch.h (+74)
  • (modified) mlir/lib/Rewrite/PatternApplicator.cpp (+47-2)
  • (modified) utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel (+1)
diff --git a/mlir/CMakeLists.txt b/mlir/CMakeLists.txt
index 44493b75b8a8c..bd30d94e1ccb4 100644
--- a/mlir/CMakeLists.txt
+++ b/mlir/CMakeLists.txt
@@ -202,6 +202,24 @@ if(MLIR_ENABLE_BINDINGS_PYTHON)
   mlir_configure_python_dev_packages()
 endif()
 
+#-------------------------------------------------------------------------------
+# MLIR Pattern Catalog Generator Configuration
+# Requires:
+#   RTTI to be enabled (set with -DLLVM_ENABLE_RTTI=ON)
+# When enabled, causes all rewriter patterns to dump their type names and the
+# names of affected operations, which can be used to build a search index
+# mapping operations to patterns.
+#-------------------------------------------------------------------------------
+
+set(MLIR_ENABLE_CATALOG_GENERATOR 0 CACHE BOOL
+       "Enables construction of a catalog of rewrite patterns.")
+
+if (MLIR_ENABLE_CATALOG_GENERATOR)
+  message(STATUS "Enabling MLIR pattern catalog generator")
+  add_definitions(-DMLIR_ENABLE_CATALOG_GENERATOR)
+  add_definitions(-DLLVM_ENABLE_RTTI)
+endif()
+
 set(CMAKE_INCLUDE_CURRENT_DIR ON)
 
 include_directories(BEFORE
@@ -322,3 +340,4 @@ endif()
 if(MLIR_STANDALONE_BUILD)
   llvm_distribution_add_targets()
 endif()
+
diff --git a/mlir/cmake/modules/MLIRConfig.cmake.in b/mlir/cmake/modules/MLIRConfig.cmake.in
index 71f3e028b1e88..f4ae70a22b3d2 100644
--- a/mlir/cmake/modules/MLIRConfig.cmake.in
+++ b/mlir/cmake/modules/MLIRConfig.cmake.in
@@ -16,6 +16,7 @@ set(MLIR_IRDL_TO_CPP_EXE "@MLIR_CONFIG_IRDL_TO_CPP_EXE@")
 set(MLIR_INSTALL_AGGREGATE_OBJECTS "@MLIR_INSTALL_AGGREGATE_OBJECTS@")
 set(MLIR_ENABLE_BINDINGS_PYTHON "@MLIR_ENABLE_BINDINGS_PYTHON@")
 set(MLIR_ENABLE_EXECUTION_ENGINE "@MLIR_ENABLE_EXECUTION_ENGINE@")
+set(MLIR_ENABLE_CATALOG_GENERATOR "@MLIR_ENABLE_CATALOG_GENERATOR@")
 
 set_property(GLOBAL PROPERTY MLIR_ALL_LIBS "@MLIR_ALL_LIBS@")
 set_property(GLOBAL PROPERTY MLIR_DIALECT_LIBS "@MLIR_DIALECT_LIBS@")
diff --git a/mlir/include/mlir/IR/PatternMatch.h b/mlir/include/mlir/IR/PatternMatch.h
index 10cfe851765dc..141b3c6806ed8 100644
--- a/mlir/include/mlir/IR/PatternMatch.h
+++ b/mlir/include/mlir/IR/PatternMatch.h
@@ -475,6 +475,80 @@ class RewriterBase : public OpBuilder {
     RewriterBase::Listener *rewriteListener;
   };
 
+  struct CatalogingListener : public RewriterBase::ForwardingListener {
+    CatalogingListener(OpBuilder::Listener *listener,
+                       const std::string &patternName, raw_ostream &os,
+                       std::mutex &writeMutex)
+        : RewriterBase::ForwardingListener(listener), patternName(patternName),
+          os(os), writeMutex(writeMutex) {}
+
+    void notifyOperationInserted(Operation *op, InsertPoint previous) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationInserted"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationInserted(op, previous);
+    }
+
+    void notifyOperationModified(Operation *op) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationModified"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationModified(op);
+    }
+
+    void notifyOperationReplaced(Operation *op, Operation *newOp) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationReplaced (with op)"
+           << " | " << op->getName() << " | " << newOp->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationReplaced(op, newOp);
+    }
+
+    void notifyOperationReplaced(Operation *op,
+                                 ValueRange replacement) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationReplaced (with values)"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationReplaced(op, replacement);
+    }
+
+    void notifyOperationErased(Operation *op) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationErased"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationErased(op);
+    }
+
+    void notifyPatternBegin(const Pattern &pattern, Operation *op) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyPatternBegin"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyPatternBegin(pattern, op);
+    }
+
+  private:
+    const std::string &patternName;
+    raw_ostream &os;
+    std::mutex &writeMutex;
+  };
+
   /// Move the blocks that belong to "region" before the given position in
   /// another region "parent". The two regions must be different. The caller
   /// is responsible for creating or updating the operation transferring flow
diff --git a/mlir/lib/Rewrite/PatternApplicator.cpp b/mlir/lib/Rewrite/PatternApplicator.cpp
index 4a12183492fd4..c66aaf267881f 100644
--- a/mlir/lib/Rewrite/PatternApplicator.cpp
+++ b/mlir/lib/Rewrite/PatternApplicator.cpp
@@ -15,8 +15,19 @@
 #include "ByteCode.h"
 #include "llvm/Support/Debug.h"
 
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/raw_ostream.h"
+#include <cxxabi.h>
+#include <mutex>
+#endif
+
 #define DEBUG_TYPE "pattern-application"
 
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+static std::mutex catalogWriteMutex;
+#endif
+
 using namespace mlir;
 using namespace mlir::detail;
 
@@ -152,6 +163,16 @@ LogicalResult PatternApplicator::matchAndRewrite(
   unsigned anyIt = 0, anyE = anyOpPatterns.size();
   unsigned pdlIt = 0, pdlE = pdlMatches.size();
   LogicalResult result = failure();
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+  std::error_code ec;
+  llvm::raw_fd_ostream catalogOs("pattern_catalog.txt", ec,
+                                 llvm::sys::fs::OF_Append);
+  if (ec) {
+    op->emitError("Failed to open pattern catalog file: " + ec.message());
+    return failure();
+  }
+#endif
+
   do {
     // Find the next pattern with the highest benefit.
     const Pattern *bestPattern = nullptr;
@@ -206,14 +227,38 @@ LogicalResult PatternApplicator::matchAndRewrite(
           } else {
             LLVM_DEBUG(llvm::dbgs() << "Trying to match \""
                                     << bestPattern->getDebugName() << "\"\n");
-
             const auto *pattern =
                 static_cast<const RewritePattern *>(bestPattern);
-            result = pattern->matchAndRewrite(op, rewriter);
 
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+            OpBuilder::Listener *oldListener = rewriter.getListener();
+            int status;
+            const char *mangledPatternName = typeid(*pattern).name();
+            char *demangled = abi::__cxa_demangle(mangledPatternName, nullptr,
+                                                  nullptr, &status);
+            std::string demangledPatternName;
+            if (status == 0 && demangled) {
+              demangledPatternName = demangled;
+              free(demangled);
+            } else {
+              // Fallback in case demangling fails.
+              demangledPatternName = mangledPatternName;
+            }
+
+            RewriterBase::CatalogingListener *catalogingListener =
+                new RewriterBase::CatalogingListener(
+                    oldListener, demangledPatternName, catalogOs,
+                    catalogWriteMutex);
+            rewriter.setListener(catalogingListener);
+#endif
+            result = pattern->matchAndRewrite(op, rewriter);
             LLVM_DEBUG(llvm::dbgs()
                        << "\"" << bestPattern->getDebugName() << "\" result "
                        << succeeded(result) << "\n");
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+            rewriter.setListener(oldListener);
+            delete catalogingListener;
+#endif
           }
 
           // Process the result of the pattern application.
diff --git a/utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel b/utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel
index 23d89f41a3a45..281b2566304b0 100644
--- a/utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel
+++ b/utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel
@@ -44,6 +44,7 @@ expand_template(
         "@MLIR_ENABLE_SPIRV_CPU_RUNNER@": "0",
         "@MLIR_ENABLE_VULKAN_RUNNER@": "0",
         "@MLIR_ENABLE_BINDINGS_PYTHON@": "0",
+        "@MLIR_ENABLE_CATALOG_GENERATOR@": "0",
         "@MLIR_RUN_AMX_TESTS@": "0",
         "@MLIR_RUN_ARM_SVE_TESTS@": "0",
         "@MLIR_RUN_ARM_SME_TESTS@": "0",

Copy link

⚠️ We detected that you are using a GitHub private e-mail address to contribute to the repo.
Please turn off Keep my email addresses private setting in your account.
See LLVM Developer Policy and LLVM Discourse for more information.

@j2kun j2kun changed the title Add an MLIR "pattern catalog" generator [mlir][core] Add an MLIR "pattern catalog" generator Jun 28, 2025
@makslevental
Copy link
Contributor

I support this with every fiber of my being! Bravo.

My only comment (aside from getDebugName) is that it shouldn't mess with the filesystem and it should just dump to the screen. If you put the impl into it's own cpp file and then do #define DEBUG_TYPE "pattern-catalog", then people will be able to produce the catalog by just doing debug-only=pattern-catalog and piping (eg to a file on disk or whatever).

@j2kun
Copy link
Contributor Author

j2kun commented Jun 28, 2025

I support this with every fiber of my being! Bravo.

My only comment (aside from getDebugName) is that it shouldn't mess with the filesystem and it should just dump to the screen. If you put the impl into it's own cpp file and then do #define DEBUG_TYPE "pattern-catalog", then people will be able to produce the catalog by just doing debug-only=pattern-catalog and piping (eg to a file on disk or whatever).

I had originally tried something like this, and my obstacle was that I couldn't get the test suite to produce useful outputs because stuff was piped to FileCheck (when I tried just dumping to llvm::outs()), and I didn't have a way to modify the RUN command of all the lit tests in a systematic/pragmatic way to add a debug-only flag.

@makslevental
Copy link
Contributor

makslevental commented Jun 28, 2025

I had originally tried something like this, and my obstacle was that I couldn't get the test suite to produce useful outputs because stuff was piped to FileCheck (when I tried just dumping to llvm::outs()), and I didn't have a way to modify the RUN command of all the lit tests in a systematic/pragmatic way to add a debug-only flag.

If you make FileCheck -dump-input=always "settable" via env var (which shouldn't be too controversial?) then I think it could work (although you'd get the IR output too). Still I think that's better than relying on a filesystem and etc.

@makslevental
Copy link
Contributor

makslevental commented Jun 28, 2025

err sorry - you'd have to get debug-only to also be settable via env var. it occurs to me that this should be a cl::opt feature? like how pip works.

@j2kun
Copy link
Contributor Author

j2kun commented Jun 28, 2025

Is not relying on a filesystem all that critical? The mutex is mildly annoying, but it's scoped pretty tightly, and the alternative requires changes in a few other components (wouldn't I also need to plumb these environment variables through the lit config?) and additional cleanup of the output.

I agree setting environment variables would be the cleanest approach. I'm just not confident in my ability to push a big change to cl::opt through, and using the filesystem seems good enough.

@makslevental
Copy link
Contributor

Is not relying on a filesystem all that critical?

I personally don't care (I think it's a reasonable use) - maybe someone else can comment on whether this kind of thing is acceptable.

@j2kun
Copy link
Contributor Author

j2kun commented Jun 29, 2025

Spot checking a few examples, it looks like I also need to hook into the dialect conversion framework.

Copy link

github-actions bot commented Jun 30, 2025

✅ With the latest revision this PR passed the Python code formatter.

@j2kun
Copy link
Contributor Author

j2kun commented Jun 30, 2025

Using ToolSubst I modified the PR to no longer use file writing and/or the mutex. It's quite simple now, and the last two steps before this PR can feasibly be merged are:

  1. Add a test that ensures the catalog output is printed.
  2. Ensure the conversion rewriter patterns are working (it seems replaceOp on conversion pattern rewriter is not notifying the listener)

j2kun and others added 2 commits June 30, 2025 10:29
Copy link

github-actions bot commented Jun 30, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- mlir/lib/IR/PatternLoggingListener.cpp mlir/include/mlir/IR/PatternMatch.h mlir/lib/Rewrite/PatternApplicator.cpp
View the diff from clang-format here.
diff --git a/mlir/lib/IR/PatternLoggingListener.cpp b/mlir/lib/IR/PatternLoggingListener.cpp
index 735997bbd..f5dc332ac 100644
--- a/mlir/lib/IR/PatternLoggingListener.cpp
+++ b/mlir/lib/IR/PatternLoggingListener.cpp
@@ -15,7 +15,8 @@ void RewriterBase::PatternLoggingListener::notifyOperationInserted(
   ForwardingListener::notifyOperationInserted(op, previous);
 }
 
-void RewriterBase::PatternLoggingListener::notifyOperationModified(Operation *op) {
+void RewriterBase::PatternLoggingListener::notifyOperationModified(
+    Operation *op) {
   LLVM_DEBUG(llvm::dbgs() << catalogPrefix << patternName
                           << " | notifyOperationModified"
                           << " | " << op->getName() << "\n");
@@ -39,7 +40,8 @@ void RewriterBase::PatternLoggingListener::notifyOperationReplaced(
   ForwardingListener::notifyOperationReplaced(op, replacement);
 }
 
-void RewriterBase::PatternLoggingListener::notifyOperationErased(Operation *op) {
+void RewriterBase::PatternLoggingListener::notifyOperationErased(
+    Operation *op) {
   LLVM_DEBUG(llvm::dbgs() << catalogPrefix << patternName
                           << " | notifyOperationErased"
                           << " | " << op->getName() << "\n");


using namespace mlir;

static constexpr StringLiteral catalogPrefix = "PatternLoggingListener: ";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A pattern we used in other files is something like this:

#define DEBUG_TYPE "liveness-analysis"
#define DBGS() (llvm::dbgs() << '[' << DEBUG_TYPE << "] ")
#define LDBG(X) LLVM_DEBUG(DBGS() << X << "\n")

And then adjusting the macro usage.

(I think it'd be best if we standardize it in the codebase by moving these macros in a common header though)

"mlir-opt --debug-only=pattern-logging-listener --mlir-disable-threading",
unresolved="fatal",
),
ToolSubst("FileCheck", "FileCheck --dump-input=always", unresolved="fatal"),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the FileCheck substitution relates to the catalog environment? Is this because some tests are redirecting stderr and you're using this to preserve the debug output?

Copy link
Collaborator

@joker-eph joker-eph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That LGTM, but @jpienaar is out for a few days and we should wait since he worked on an alternative.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bazel "Peripheral" support tier build system: utils/bazel mlir:core MLIR Core Infrastructure mlir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants